Covering a Circular String with Substrings of Fixed Length

Authors

  • Art M. Duval
  • William F. Smyth
Abstract

A nonempty circular string C(x) of length n is said to be covered by a set U_k of strings, each of fixed length k ≤ n, iff every position in C(x) lies within an occurrence of some string u ∈ U_k. In this paper we consider the problem of determining the minimum cardinality of a set U_k which guarantees that every circular string C(x) of length n ≥ k can be covered. In particular, we show how, for any positive integer m, to choose the elements of U_k so that, for sufficiently large k, u_k ≈ σ^(k−m), where u_k = |U_k| and σ is the size of the alphabet on which the strings are defined. The problem has application to DNA sequencing by hybridization using oligonucleotide probes.
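The covering condition in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `is_covered` and its parameters are hypothetical, and it only checks the definition by brute force for a given string and a given set; it does not construct a minimum set U_k.

```python
def is_covered(x: str, probes: set, k: int) -> bool:
    """Check whether the circular string C(x) is covered by `probes`.

    Hypothetical helper, not from the paper: `probes` plays the role of
    U_k, a set of strings that are all assumed to have length k.
    """
    n = len(x)
    if n == 0 or not 1 <= k <= n:
        raise ValueError("need a nonempty string and 1 <= k <= n")
    covered = [False] * n
    for start in range(n):
        # Length-k window starting at `start`, wrapping around the end of x.
        window = "".join(x[(start + j) % n] for j in range(k))
        if window in probes:
            # Every position inside a matching occurrence counts as covered.
            for j in range(k):
                covered[(start + j) % n] = True
    return all(covered)


if __name__ == "__main__":
    print(is_covered("abab", {"ab"}, 2))
    print(is_covered("aab", {"ab"}, 2))
    print(is_covered("aab", {"ab", "ba"}, 2))
```

In the second call, position 0 of "aab" lies only in the windows "aa" and the wrap-around "ba", so a set containing just "ab" does not cover it; adding "ba" does.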


Similar resources

De Bruijn Sequences for Fixed-Weight Binary Strings

De Bruijn sequences are circular strings of length 2^n whose length-n substrings are the binary strings of length n. Our focus is on creating circular strings of length (


Fixed-density De Bruijn Sequences

De Bruijn sequences are circular strings of length 2^n whose substrings are the binary strings of length n. Our focus is on de Bruijn sequences for binary strings that have the same density (number of 1s). We construct circular strings of length (n−1 choose d)


An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data

Finding similar substrings/substructures is a central task in analyzing huge amounts of string data such as genome sequences, web documents, log data, etc. In the sense of complexity theory, the existence of polynomial time algorithms for such problems is usually trivial since the number of substrings is bounded by the square of their lengths. However, straightforward algorithms do not work for...


Generating Necklaces and Strings with Forbidden Substrings

Given a length-m string f over a k-ary alphabet and a positive integer n, we develop efficient algorithms to generate (a) all k-ary strings of length n that have no substring equal to f, (b) all k-ary circular strings of length n that have no substring equal to f, and (c) all k-ary necklaces of length n that have no substring equal to f, where f is an aperiodic necklace. Each of the algorithms ru...


A Parallel Algorithm for the Fixed-length Approximate String Matching Problem for High Throughput Sequencing Technologies

The approximate string matching problem is to find all locations at which a query of length m matches a substring of a text of length n with k-or-fewer differences. Nowadays, with the advent of novel high throughput sequencing technologies, the approximate string matching algorithms are used to identify similarities, molecular functions and abnormalities in DNA sequences. We consider a generali...



Journal:
  • Int. J. Found. Comput. Sci.

Volume 7


Publication date: 1996